Abstract. We introduce the graph parameter readability and study it as a function of the number
of vertices in a graph. Given a digraph D, an injective overlap labeling assigns a unique string to
each vertex such that there is an arc from x to y if and only if x properly overlaps y. The readability
of D is the minimum string length for which an injective overlap labeling exists. In applications that 
utilize overlap digraphs (e.g., in bioinformatics), readability reflects the length of the strings from which 
the overlap digraph is constructed. We study the asymptotic behaviour of readability by casting it in 
purely graph theoretic terms (without any reference to strings). We prove upper and lower bounds on 
readability for certain graph families and general graphs. 


1 Introduction 

In this paper, we introduce and study a graph parameter called readability, motivated by applications
of overlap graphs in bioinformatics. A string x overlaps a string y if there is a suffix of x
that is equal to a prefix of y. They overlap properly if, in addition, the suffix and prefix are both
proper. The overlap digraph of a set of strings S is a digraph where each string is a vertex and
there is an arc from x to y (possibly with x = y) if and only if x properly overlaps y. Walks in
the overlap digraph of S represent strings that can be spelled by stitching strings of S together,
using the overlaps between them. Overlap digraphs have various applications, e.g., they are used
by approximation algorithms for the Shortest Superstring Problem [Swe00]. Their most impactful
application, however, has been in bioinformatics. Their variants, such as de Bruijn graphs [IW95]
and string graphs [Mye05], have formed the basis of nearly all genome assemblers used today (see
[MKS10, NP13] for a survey), successful despite results showing that assembly is a hard problem
in theory [BBT13, NP09, MGMB07]. In this context, the strings of S represent known fragments of
the genome (called reads), and the genome is represented by walks in the overlap digraph of S.
However, do the overlap digraphs generated in this way capture all possible digraphs, or do they
have any properties or structure that can be exploited?

Braga and Meidanis [BM02] showed that overlap digraphs capture all possible digraphs, i.e., 
for every digraph D, there exists a set of strings S such that their overlap digraph is D. Their 
proof takes an arbitrary digraph and shows how to construct an injective overlap labeling, that is, a
function assigning a unique string to each vertex, such that (x, y) is an arc if and only if the string
assigned to x properly overlaps the string assigned to y. However, the length of strings produced 
by their method can be exponential in the number of vertices. In the bioinformatics context, this 
is unrealistic, as the read size is typically much smaller than the number of reads. 

To investigate the relationship between the string length and the number of vertices, we introduce
a graph parameter called readability. The readability of a digraph D, denoted r(D),
is the smallest nonnegative integer r such that there exists an injective overlap labeling of D
with strings of length r. The result by [BM02] shows that readability is well defined and is at
most $2^{\Delta+1} - 1$, where $\Delta$ is the maximum of the in- and out-degrees of vertices in D. However,
nothing else is known about the parameter, though there are papers that look at related notions
[BFK+02, BFKK02, BHKdW99, GP14, LZ07, LZ10, PSW03, TU88].

In this paper, we study the asymptotic behaviour of readability as a function of the number 
of vertices in a graph. We define readability for undirected bipartite graphs and show that the 
two definitions of readability are asymptotically equivalent. We capture readability using purely 
graph theoretic parameters (i.e., without any reference to strings). For trees, we give a parameter 
that characterizes readability exactly. For the larger family of bipartite $C_4$-free graphs, we give
a parameter that approximates readability to within a factor of 2. Finally, for general bipartite 
graphs, we give a parameter that is bounded on the same sets of graphs as readability. 

We apply our purely graph theoretic interpretation to prove readability upper and lower bounds 
on several graph families. We show, using a counting argument, that almost all digraphs and 
bipartite graphs have readability of at least $\Omega(n/\log n)$. Next, we construct a graph family inspired
by Hadamard codes and prove that it has readability $\Omega(n)$. Finally, we show that the readability
of trees is bounded from above by their radius, and there exist trees of arbitrary readability that 
achieve this bound. 

2 Preliminaries 

General definitions and notation. Let x be a string. We denote the length of x by |x|. We use
x[i] to refer to the i-th character of x, and denote by x[i..j] the substring of x from the i-th to the
j-th character, inclusive. We let $\mathrm{pre}_i(x)$ denote the prefix x[1..i] of x, and we let $\mathrm{suf}_i(x)$ denote the
suffix x[|x| - i + 1..|x|]. Let y be another string. We denote by $x \cdot y$ the concatenation of x and y.
We say that x overlaps y if there exists an i with $1 \le i \le \min\{|x|, |y|\}$ such that $\mathrm{suf}_i(x) = \mathrm{pre}_i(y)$.
In this case, we say that x overlaps y by i. If $i < \min\{|x|, |y|\}$, then we call the overlap proper.
Define ov(x, y) as the minimum i such that x overlaps y by i, or 0 if x does not overlap y. For a
positive integer n, we denote by [n] the set {1, ..., n}.
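The string definitions above can be made concrete with a short sketch. The function names `ov` and `overlaps_properly` are illustrative choices, and Python strings stand in for strings over an arbitrary alphabet.

```python
def ov(x: str, y: str) -> int:
    """Return the minimum i >= 1 with suf_i(x) == pre_i(y), or 0 if x does not overlap y."""
    for i in range(1, min(len(x), len(y)) + 1):
        if x[len(x) - i:] == y[:i]:
            return i
    return 0


def overlaps_properly(x: str, y: str) -> bool:
    """Return True if x overlaps y by some i with 1 <= i < min(|x|, |y|)."""
    return any(x[len(x) - i:] == y[:i] for i in range(1, min(len(x), len(y))))
```

For instance, `ov("abc", "bcd")` evaluates to 2, while "abc" overlaps itself only by its full length, so that overlap is not proper.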

We refer to finite simple undirected graphs simply as graphs and to finite directed graphs 
without parallel arcs in the same direction as digraphs. For a vertex v in a graph, we denote the set
of neighbors of v by N(v). A biclique is a complete bipartite graph. Note that the one-vertex graph
is a biclique (with one of the parts of its bipartition being empty). Two vertices u, v in a graph are
called twins if they have the same neighbors, i.e., if N(u) = N(v). If, in addition, $N(u) = N(v) \neq \emptyset$,
vertices u, v are called non-isolated twins. A matching is a graph of maximum degree at most 1,
though we will sometimes slightly abuse the terminology and not distinguish between matchings
and their edge sets. A cycle (respectively, path) on i vertices is denoted by $C_i$ (respectively, $P_i$).
For graph terms not defined here, see, e.g., [BM08] . 

Readability of digraphs. A labeling $\ell$ of a graph or digraph is a function assigning a string to
each vertex such that all strings have the same length, denoted by $\mathrm{len}(\ell)$. We define
$\mathrm{ov}_\ell(u,v) = \mathrm{ov}(\ell(u), \ell(v))$. An overlap labeling of a digraph D = (V, A) is a labeling $\ell$ such that $(u,v) \in A$
if and only if $0 < \mathrm{ov}_\ell(u,v) < \mathrm{len}(\ell)$. An overlap labeling is said to be injective if it does not
generate duplicate strings. Recall that the readability of a digraph D, denoted r(D), is the smallest
nonnegative integer r such that there exists an injective overlap labeling of D of length r. We note
that in our definition of readability we do not place any restrictions on the alphabet size. Braga and
Meidanis [BM02] gave a reduction from an overlap labeling of length $\ell$ over an arbitrary alphabet
$\Sigma$ to an overlap labeling of length $\ell \log |\Sigma|$ over the binary alphabet.

Readability of bipartite graphs. We also define a modified notion of readability that applies to
balanced bipartite graphs as opposed to digraphs. We found that readability on balanced bipartite
graphs is simpler to study but is asymptotically equivalent to readability on digraphs. Let G =
(V, E) be a bipartite graph with a given bipartition of its vertex set $V(G) = V_s \cup V_p$. (We also use
the notation $G = (V_s, V_p, E)$.) We say that G is balanced if $|V_s| = |V_p|$. An overlap labeling of G is
a labeling $\ell$ of G such that for all $u \in V_s$ and $v \in V_p$, $(u,v) \in E$ if and only if $\mathrm{ov}_\ell(u,v) > 0$. In
other words, overlaps are exclusively between the suffix of a string assigned to a vertex in $V_s$ and
the prefix of a string assigned to a vertex in $V_p$. The readability of G is the smallest nonnegative
integer r such that there exists an overlap labeling of G of length r. Note that we do not require
injectivity of the labeling, nor do we require the overlaps to be proper. As before, we use r(G) to
denote the readability of G.
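To contrast the two notions of an overlap labeling, the following sketch checks a candidate labeling against each definition. The function names and the dictionary-based representation (a `label` map from vertices to strings, arcs and edges as sets of pairs) are assumptions made for illustration only.

```python
def ov(x, y):
    """Minimum i >= 1 with suf_i(x) == pre_i(y), or 0 if x does not overlap y."""
    for i in range(1, min(len(x), len(y)) + 1):
        if x[len(x) - i:] == y[:i]:
            return i
    return 0


def same_length(strings):
    """A labeling must assign strings of one common length."""
    return len({len(s) for s in strings}) <= 1


def is_injective_overlap_labeling(vertices, arcs, label):
    """Digraph case: labels distinct, and (u, v) is an arc iff 0 < ov < len."""
    strings = [label[v] for v in vertices]
    if not same_length(strings) or len(set(strings)) != len(strings):
        return False
    length = len(strings[0]) if strings else 0
    return all(((u, v) in arcs) == (0 < ov(label[u], label[v]) < length)
               for u in vertices for v in vertices)


def is_bipartite_overlap_labeling(vs, vp, edges, label):
    """Bipartite case: (u, v) is an edge iff ov > 0; no injectivity, overlaps may be improper."""
    if not same_length([label[x] for x in list(vs) + list(vp)]):
        return False
    return all(((u, v) in edges) == (ov(label[u], label[v]) > 0)
               for u in vs for v in vp)
```

A single edge can be labeled with the one-character string "a" on both endpoints in the bipartite setting, whereas the same labels are rejected for a digraph because they are neither distinct nor properly overlapping.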

We note that in our definition of readability we do not place any restrictions on the alphabet 
size. Braga and Meidanis [BM02] gave a reduction from an overlap labeling of length $\ell$ over an
arbitrary alphabet $\Sigma$ to an overlap labeling of length $\ell \log |\Sigma|$ over the binary alphabet.

For a labeling $\ell$, we define $\mathrm{inner}_i(\ell(v)) = \mathrm{suf}_i(\ell(v))$ if $v \in V_s$ and $\mathrm{inner}_i(\ell(v)) = \mathrm{pre}_i(\ell(v))$
if $v \in V_p$. Similarly, we define $\mathrm{outer}_i(\ell(v)) = \mathrm{pre}_i(\ell(v))$ if $v \in V_s$ and $\mathrm{outer}_i(\ell(v)) = \mathrm{suf}_i(\ell(v))$ if
$v \in V_p$.

Let $\mathcal{B}_{n \times n}$ be the set of balanced bipartite graphs with nodes [n] in each part, and let $\mathcal{D}_n$ be
the set of all digraphs with nodes [n]. The readabilities of digraphs and of bipartite graphs are
connected by the following theorem, which implies that they are asymptotically equivalent.


Theorem 2.1. There exists a bijection $\psi : \mathcal{B}_{n \times n} \to \mathcal{D}_n$ with the property that for any $G \in \mathcal{B}_{n \times n}$
and $D \in \mathcal{D}_n$ such that $D = \psi(G)$, we have that $r(G) < r(D) \le 2 \cdot r(G) + 1$.


As a result, we can study readability of balanced bipartite graphs without asymptotically
affecting our bounds. For example, we show in Section 4.2 (in Theorem 4.2) that there exists a
family of balanced bipartite graphs with readability $\Omega(n)$, which leads to the existence of digraphs
with readability $\Omega(n)$.


3 Graph theoretic characterizations 

In this section, we relate readability of balanced bipartite graphs to several purely graph theoretic 
parameters, without reference to strings. 


3.1 Trees and $C_4$-free graphs

For trees, we give an exact characterization of readability, while for $C_4$-free graphs, we give a
parameter that is a 2-approximation to readability. A decomposition of size k of a bipartite graph
$G = (V_s, V_p, E)$ is a function on the edges of the form $w : E \to [k]$. Note that a labeling $\ell$ of G implies
a decomposition of G, defined by $w(e) = \mathrm{ov}_\ell(e)$ for all $e \in E$. We call this the $\ell$-decomposition. We
say that a labeling $\ell$ of G achieves w if it is an overlap labeling and w is the $\ell$-decomposition. Note
that we can express readability as

$r(G) = \min\{k \mid w \text{ is a decomposition of size } k,\ \exists \text{ a labeling } \ell \text{ that achieves } w\}$.

Fig. 1: Illustration that Theorem 3.2 cannot be extended to graphs with a $C_4$. Example of a graph
and decomposition that satisfies the strict $P_4$-rule, yet no overlap labeling $\ell$ exists that achieves it.

Our goal is to characterize in graph theoretic terms the properties of w which are satisfied if and
only if w is the $\ell$-decomposition, for some $\ell$. While this proves challenging in general, we can
achieve this for trees using a condition which we call the $P_4$-rule. We say that w satisfies the $P_4$-rule
if for every induced four-vertex path $P = (e_1, e_2, e_3)$ in G, the following condition holds: if
$w(e_2) = \max\{w(e_1), w(e_2), w(e_3)\}$, then $w(e_2) \ge w(e_1) + w(e_3)$. We will prove:

Theorem 3.1. Let T be a tree. Then $r(T) = \min\{k \mid w$ is a decomposition of size k that
satisfies the $P_4$-rule$\}$.
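The $P_4$-rule is a purely combinatorial condition, so it can be checked mechanically. The sketch below is one possible checker for bipartite graphs; the name `satisfies_p4_rule`, the `strict` flag, and the representation of a decomposition as a dictionary from `frozenset` edges to weights are all assumptions for illustration.

```python
def satisfies_p4_rule(w, strict=False):
    """Check the P4-rule (or its strict variant) for a decomposition of a bipartite graph.

    w maps each edge, given as a frozenset of its two endpoints, to a positive weight.
    """
    adj = {}
    for e in w:
        x, y = tuple(e)
        adj.setdefault(x, set()).add(y)
        adj.setdefault(y, set()).add(x)
    for e2 in w:                       # e2 plays the role of the middle edge (b, c)
        b, c = tuple(e2)
        for a in adj[b] - {c}:
            for d in adj[c] - {b}:
                if a == d or d in adj[a]:
                    continue           # the path a-b-c-d is not induced
                w1, w2, w3 = w[frozenset((a, b))], w[e2], w[frozenset((c, d))]
                if w2 == max(w1, w2, w3):
                    holds = w2 > w1 + w3 if strict else w2 >= w1 + w3
                    if not holds:
                        return False
    return True


# The six-cycle with weights 2, 4, 2, 2, 3, 1 on consecutive edges.
c6 = {frozenset((i, (i + 1) % 6)): wt for i, wt in enumerate([2, 4, 2, 2, 3, 1])}
```

On this weighted six-cycle the checker accepts the $P_4$-rule and rejects the strict variant, since the edge of weight 4 has neighbors of total weight exactly 4.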

Note that for cycles, the equality does not hold. For example, consider the decomposition w of
$C_6$ given by the weights 2, 4, 2, 2, 3, 1. This decomposition satisfies the $P_4$-rule but it can be shown
using case analysis that there does not exist a labeling $\ell$ achieving w.

However, we can give a characterization of readability for $C_4$-free graphs in terms of a parameter
that is asymptotically equivalent to readability, using a condition which we call the strict $P_4$-rule.
The strict $P_4$-rule is identical to the $P_4$-rule except that the inequality becomes strict. That is,
w satisfies the strict $P_4$-rule if for every induced four-vertex path $P = (e_1, e_2, e_3)$, if
$w(e_2) = \max\{w(e_1), w(e_2), w(e_3)\}$, then $w(e_2) > w(e_1) + w(e_3)$. Note that a decomposition that satisfies the
strict $P_4$-rule automatically satisfies the $P_4$-rule, but not vice versa. We will prove:

Theorem 3.2. Let G be a $C_4$-free bipartite graph. Let $t = \min\{k \mid w$ is a decomposition of
size k that satisfies the strict $P_4$-rule$\}$. Then $t/2 < r(G) \le t$.


We note that this characterization cannot be extended to graphs with a $C_4$. The example in
Figure 1 shows a graph with a decomposition which satisfies the strict $P_4$-rule but it can be shown
using case analysis that there does not exist a labeling $\ell$ achieving this decomposition.

In the remainder of this section, we will prove these two theorems. We first show that an
$\ell$-decomposition satisfies the $P_4$-rule (proof in the Appendix).


Lemma 3.1. Let $\ell$ be an overlap labeling of a bipartite graph G. Then the $\ell$-decomposition satisfies
the $P_4$-rule.


Now, consider a $C_4$-free bipartite graph $G = (V_s, V_p, E)$ and let w be a decomposition satisfying
the $P_4$-rule. We will prove both Theorem 3.1 and Theorem 3.2 by constructing the following labeling.
Let us order the edges $e_1, \ldots, e_{|E|}$ in order of non-decreasing weight. For $0 \le j \le |E|$, we define
the graph $G^j = (V_s, V_p, \{e_i \in E \mid i \le j\})$. For a vertex u, define $\mathrm{len}_j(u) = \max\{w(e_i) \mid i \le j,\ e_i$ is incident with $u\}$,
if the degree of u in $G^j$ is positive, and 0 otherwise. We will recursively
define a labeling $\ell_j$ of $G^j$ such that $|\ell_j(u)| = \mathrm{len}_j(u)$ for all u. The initial labeling $\ell_0$ assigns $\varepsilon$
to every vertex. Suppose we have a labeling $\ell_j$ for $G^j$ and $e_{j+1} = (u,v)$. Recall that because w
satisfies the $P_4$-rule and G is $C_4$-free, $w(u,v) \ge \mathrm{len}_j(u) + \mathrm{len}_j(v) = |\ell_j(u)| + |\ell_j(v)|$. (Note that the
inequality holds also in the case when one of the two summands is 0.) Let A be a (possibly empty)
string of length $w(u,v) - |\ell_j(u)| - |\ell_j(v)|$ composed of non-repeating characters that do not exist
in $\ell_j$. Define $\ell_{j+1}$ as $\ell_{j+1}(x) = \ell_j(x)$ for all $x \notin \{u, v\}$, and $\ell_{j+1}(u) = \ell_{j+1}(v) = \ell_j(v) \cdot A \cdot \ell_j(u)$. We
denote the labeling of G as $\ell = \ell_{|E|}$. We will slightly abuse notation in this section, ignoring the
fact that a labeling must have labels of the same length. This is inconsequential, because strings
can always be padded from the beginning or end with distinct characters without affecting any
overlaps.
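The recursive construction above can be sketched in a few lines. This is an illustrative rendering, not the authors' code: the name `build_labeling`, the use of integer tuples as strings over an unbounded alphabet, and the choice to leave labels unpadded are assumptions made here.

```python
from itertools import count


def ov(x, y):
    """Minimum i >= 1 with suf_i(x) == pre_i(y), or 0 if x does not overlap y."""
    for i in range(1, min(len(x), len(y)) + 1):
        if x[len(x) - i:] == y[:i]:
            return i
    return 0


def build_labeling(vs, vp, w):
    """Process edges by non-decreasing weight, relabeling both endpoints each time.

    w maps (u, v), with u in vs and v in vp, to a positive integer weight.
    Labels are tuples of integers and are left unpadded.
    """
    fresh = count()                                  # source of never-used characters
    label = {x: () for x in list(vs) + list(vp)}     # the initial labeling assigns the empty string
    for (u, v), weight in sorted(w.items(), key=lambda item: item[1]):
        gap = weight - len(label[u]) - len(label[v])
        if gap < 0:
            raise ValueError("w(u, v) < len_j(u) + len_j(v): construction not applicable")
        filler = tuple(next(fresh) for _ in range(gap))   # the middle string A
        label[u] = label[v] = label[v] + filler + label[u]
    return label


# A four-vertex path s1 - p1 - s2 - p2 with weights 1, 2, 1.
vs, vp = ["s1", "s2"], ["p1", "p2"]
w = {("s1", "p1"): 1, ("s2", "p1"): 2, ("s2", "p2"): 1}
label = build_labeling(vs, vp, w)
achieved = {(u, v): ov(label[u], label[v]) for u in vs for v in vp}
```

On this path, the computed overlaps equal the prescribed weights on the three edges and are 0 on the single non-edge, so the labeling achieves w in this small case.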

First, we state a useful lemma: two vertices share a character in the labeling only if they
are connected by a path (proof in the Appendix).

Lemma 3.2. Let c be a character that is contained in $\ell_j(u)$ and in $\ell_j(v)$, for some pair of distinct
vertices u and v. Then there exists a path between u and v in $G^j$.

We are now ready to show that $\ell$ achieves w for trees, and, if w also satisfies the strict $P_4$-rule,
for $C_4$-free graphs.

Lemma 3.3. Let G be a $C_4$-free bipartite graph and let w be a decomposition that satisfies the
$P_4$-rule. Then the above defined labeling $\ell$ achieves w if w satisfies the strict $P_4$-rule or if G is
acyclic.

Proof. We prove by induction on j that $\ell_j$ achieves w on $G^j$. Suppose that the lemma holds for
$\ell_j$ and consider the effect of adding $e_{j+1} = (u,v)$. Notice that to obtain $\ell_{j+1}$ we only change labels
by adding outer characters; hence, any two vertices that overlap by i in $\ell_j$ will also overlap by i in
$\ell_{j+1}$. Moreover, only the labels of u and v are changed, and an overlap between u and v of length
w(u,v) is created. It remains to show that no shorter overlap is created between u and v and that
no new overlap is created involving u or v, except the one between u and v.

First, consider the case when $w(u,v) > |\ell_j(u)| + |\ell_j(v)|$ and so the middle string (A) of the new
labels is non-empty. Because the characters of A do not appear in $\ell_j$, we do not create any new
overlaps besides the one between u and v, and the only overlap between u and v must be of
length w(u,v) since the characters of A must align. Thus $\ell_{j+1}$ achieves w on $G^{j+1}$.

Next, consider the case when $w(u,v) = |\ell_j(v)|$ (the case when $w(u,v) = |\ell_j(u)|$ is symmetric).
In this case, $A = \varepsilon$, $\ell_j(u) = \varepsilon$, and $|\ell_j(v)| > 0$ (since w(u,v) > 0). Suppose for the sake of
contradiction that there exists a vertex $v' \neq v$ such that (u, v') is not an edge but
$\mathrm{inner}_k(\ell_{j+1}(u)) = \mathrm{inner}_k(\ell_{j+1}(v'))$, for some $0 < k \le w(u,v)$. We know, from the construction of $\ell_j$, that there
exists a vertex u' such that $w(u',v) = |\ell_j(v)|$. We then have
$\mathrm{inner}_k(\ell_j(u')) = \mathrm{outer}_k(\ell_j(v)) = \mathrm{inner}_k(\ell_{j+1}(u)) = \mathrm{inner}_k(\ell_{j+1}(v')) = \mathrm{inner}_k(\ell_j(v'))$.
By the induction hypothesis, there is an edge (u',v') and $w(u',v') \le k$. The edges (u,v), (v,u'), (u',v') form a $P_4$, which is also induced because
G is $C_4$-free. Because $w(u,v) = w(u',v) \ge w(u',v') > 0$, the $P_4$-rule is violated, a contradiction.
Therefore no new overlaps are created involving u. To show that there are no overlaps from u to
v smaller than w(u,v), observe that any such overlap would also be an overlap between u' and v
that is smaller than w(u',v), contradicting the induction hypothesis. Therefore, $\ell_{j+1}$ achieves w on
$G^{j+1}$.

It remains to consider the case when $w(u,v) = |\ell_j(u)| + |\ell_j(v)|$ and $\ell_j(u) \neq \varepsilon \neq \ell_j(v)$. We first
show that this case cannot arise if w satisfies the strict $P_4$-rule. There must exist edges in $G^j$ of
weights $|\ell_j(u)|$ and $|\ell_j(v)|$ incident with u and v, respectively. These edges, together with (u,v) in
the middle, form a $P_4$, which must be induced since G does not contain a $C_4$. Furthermore, (u,v)
achieves the maximum weight. The strict $P_4$-rule implies $w(u,v) > |\ell_j(u)| + |\ell_j(v)|$, a contradiction.

Now, assume that G is acyclic, and suppose for the sake of contradiction that the new labeling
creates an overlap between v and a vertex $u' \neq u$ (the case of an overlap between u and $v' \neq v$ is
symmetric). Consider the character c at position $|\ell_j(v)| + 1$ of $\ell_{j+1}(v)$. The length of the overlap
between $\ell_{j+1}(v)$ and $\ell_{j+1}(u') = \ell_j(u')$ must be greater than $|\ell_j(v)|$, otherwise it would have been
an overlap in $\ell_j$. Thus, $\ell_j(u')$ must contain c. By construction of u's new label, $\ell_j(u)$ must also
contain c. Applying Lemma 3.2, there must be a path between u' and u in $G^j$. On the other hand,
the overlap between v and u' spans $(\ell_j(v))[1]$, and hence $\ell_j(v)$ and $\ell_j(u')$ must share a character.
Applying Lemma 3.2, there must exist a path between u' and v in $G^j$. Consequently, there exists
a path from u to v in $G^j$. Combining this path with $e_{j+1} = (u,v)$, we get a cycle in $G^{j+1}$, which is
a contradiction.

Finally suppose, for the sake of contradiction, that $\ell_{j+1}(u)$ overlaps $\ell_{j+1}(v)$ by some k < w(u,v).
By the induction hypothesis, $k > |\ell_j(v)|$. Consider the last character c of $\ell_j(v)$. It must also appear
as the inner position $i = k - |\ell_j(v)| + 1$ in $\ell_{j+1}(u)$. Since $k \le w(u,v) - 1$, we have $i \le w(u,v) - |\ell_j(v)| = |\ell_j(u)|$,
and the i-th inner position in $\ell_{j+1}(u)$ is also the i-th inner position in $\ell_j(u)$. Applying
Lemma 3.2 to c in $\ell_j(v)$ and $\ell_j(u)$, there must exist a path between u and v in $G^j$. Combining this
path with $e_{j+1} = (u,v)$, we get a cycle in $G^{j+1}$, which is a contradiction. □

We can now prove Theorems 3.1 and 3.2.


Proof of Theorem 3.1. Let $t = \min\{k \mid w$ is a decomposition of size k that satisfies the $P_4$-rule$\}$.
First, let w be a decomposition of size t satisfying the $P_4$-rule. Lemma 3.3 states that the above
defined labeling $\ell$ achieves w and so $r(T) \le \max_e w(e) = t$. For the other direction, consider an
overlap labeling b of T of minimum length. By Lemma 3.1, the b-decomposition satisfies the $P_4$-rule.
Hence, $r(T) = \mathrm{len}(b) \ge t$. □


Proof of Theorem 3.2. Let w be a decomposition of size t satisfying the strict $P_4$-rule. By Lemma 3.3,
the above defined labeling $\ell$ achieves w and so $r(G) \le \max_e w(e) = t$. On the other hand, let b
be an overlap labeling of length r(G). Define $w(e) = 2\,\mathrm{ov}_b(e) - 1$, for all $e \in E(G)$. We claim that
w satisfies the strict $P_4$-rule, which will imply that $t \le \max_e w(e) = 2r(G) - 1$. To see this, let
$e_1, e_2, e_3$ be the edges of an arbitrary induced $P_4$. Observe that $w(e_2) = \max\{w(e_1), w(e_2), w(e_3)\}$
if and only if $\mathrm{ov}_b(e_2) = \max\{\mathrm{ov}_b(e_1), \mathrm{ov}_b(e_2), \mathrm{ov}_b(e_3)\}$. Furthermore, it can be algebraically verified
that if $\mathrm{ov}_b(e_2) \ge \mathrm{ov}_b(e_1) + \mathrm{ov}_b(e_3)$ then $w(e_2) > w(e_1) + w(e_3)$. By Lemma 3.1, the b-decomposition
satisfies the $P_4$-rule and, therefore, w satisfies the strict $P_4$-rule. □
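For completeness, the algebraic step used in the proof of Theorem 3.2 can be written out as follows, with the shorthand $a_i = \mathrm{ov}_b(e_i)$ (a symbol introduced here only for this derivation) and $w(e_i) = 2a_i - 1$:

```latex
\begin{align*}
a_2 \ge a_1 + a_3
  &\implies 2a_2 - 1 \ge 2a_1 + 2a_3 - 1 \\
  &\implies 2a_2 - 1 > 2a_1 + 2a_3 - 2 = (2a_1 - 1) + (2a_3 - 1) \\
  &\implies w(e_2) > w(e_1) + w(e_3).
\end{align*}
```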


3.2 General graphs 

In the previous subsection, we derived graph theoretic characterizations of readability that are
exact for trees and approximate for $C_4$-free bipartite graphs. Unfortunately, for a general graph, it
is not clear how to construct an overlap labeling from a decomposition satisfying the $P_4$-rule (as
we did in Lemma 3.3). In this subsection, we will consider an alternate rule (HUB-rule), which we
then use to construct an overlap labeling.

Given $G = (V_s, V_p, E)$ and a decomposition w of size k, we define $G_i^w$, for $i \in [k]$, as a graph
with the same vertices as G and edges given by $E(G_i^w) = \{e \in E \mid w(e) = i\}$. When w is obvious
from the context, we will write $G_i$ instead of $G_i^w$. Observe that the edge sets of $G_1^w, \ldots, G_k^w$ form a
partition of E. We say that w satisfies the hierarchical-union-of-bicliques rule, abbreviated as the
HUB-rule, if the following conditions hold: i) for all $i \in [k]$, $G_i^w$ is a disjoint union of bicliques, and
ii) if two distinct vertices u and v are non-isolated twins in $G_i^w$ for some $i \in \{2, \ldots, k\}$ then, for all
$j \in [i-1]$, u and v are (possibly isolated) twins in $G_j^w$. An example of a decomposition satisfying
the HUB-rule is any $w : E \to [k]$ such that $G_1^w$ is an (arbitrary) disjoint union of bicliques and
$G_2^w, \ldots, G_k^w$ are matchings. We can show that the decomposition implied by any overlap labeling
must satisfy the HUB-rule (proof in the Appendix).

Lemma 3.4. Let $\ell$ be an overlap labeling of a bipartite graph G. Then the $\ell$-decomposition satisfies
the HUB-rule.
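Both conditions of the HUB-rule can be tested directly from the definition. The following sketch is a naive checker written for illustration; the function name `satisfies_hub_rule` and the input format (a vertex list plus a dictionary from edge pairs to levels in [k]) are assumptions, and no attempt is made at efficiency.

```python
from itertools import combinations


def satisfies_hub_rule(vertices, w):
    """Check conditions (i) and (ii) of the HUB-rule for a decomposition w.

    w maps each edge (u, v) of a bipartite graph to its level in 1..k.
    """
    k = max(w.values(), default=0)
    # nbrs[i][x] is the neighborhood of x in the level-i graph G_i.
    nbrs = {i: {x: set() for x in vertices} for i in range(1, k + 1)}
    for (u, v), i in w.items():
        nbrs[i][u].add(v)
        nbrs[i][v].add(u)
    for i in range(1, k + 1):
        n_i = nbrs[i]
        # (i) G_i is a disjoint union of bicliques: vertices sharing a
        # neighbor must have identical neighborhoods in G_i.
        for x in vertices:
            for y in n_i[x]:
                if any(n_i[x2] != n_i[x] for x2 in n_i[y]):
                    return False
        # (ii) non-isolated twins in G_i are twins in every lower level.
        for a, b in combinations(vertices, 2):
            if n_i[a] and n_i[a] == n_i[b]:
                if any(nbrs[j][a] != nbrs[j][b] for j in range(1, i)):
                    return False
    return True
```

For example, placing all three edges of a four-vertex path at level 1 violates condition (i), while placing the middle edge at level 2 and the two outer edges at level 1 satisfies both conditions.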


We define the HUB number of G as the minimum size of a decomposition of G that satisfies the
HUB-rule, and denote it by hub(G). Observe that a decomposition of a graph into matchings (i.e.,
each $G_i^w$ is a matching) satisfies the HUB-rule. By König's Line Coloring Theorem, any bipartite
graph G can be decomposed into $\Delta(G)$ matchings, where $\Delta(G)$ is the maximum degree of G. Thus,
$\mathrm{hub}(G) \in [\Delta(G)]$. Clearly, a graph G has hub(G) = 1 if and only if G is a disjoint union of bicliques.
The HUB number captures readability in the sense that the readability of a graph family is bounded
(by a uniform constant independent of the number of vertices) if and only if its HUB number is
bounded. This is captured by the following theorem:

Theorem 3.3. Let G be a bipartite graph. Then $\mathrm{hub}(G) \le r(G) \le 2^{\mathrm{hub}(G)} - 1$.

In the remainder of this section, we will prove this theorem. The first inequality directly follows
from Lemma 3.4 because, by definition of readability, there exists an overlap labeling $\ell$ of length
r(G). Then the $\ell$-decomposition of G is of size r(G) and satisfies the HUB-rule, implying
$\mathrm{hub}(G) \le r(G)$. To prove the second inequality, we will need to show:


Lemma 3.5. Let w be a decomposition of size k satisfying the HUB-rule of a bipartite graph G.
Then there is an overlap labeling of G of length $2^k - 1$.


The second inequality of Theorem 3.3 follows directly by choosing a minimum decomposition
satisfying the HUB-rule, in which case k = hub(G). Thus, it only remains to prove Lemma 3.5.

We now define the labeling $\ell$ that is used to prove Lemma 3.5. Our construction of the labeling
applies the following operation due to Braga and Meidanis [BM02]. Given two vertices $u \in V_s$ and
$v \in V_p$, a labeling $\ell$, and a filler character a not used by $\ell$, the BM operation transforms $\ell$ by
relabeling both u and v with $\ell(v) \cdot a \cdot \ell(u)$.

We start by labeling $G_1$ as follows: each biclique B in $G_1$ gets assigned a unique character $a_B$,
and each node v in a biclique B gets label $\ell(v) = a_B$. Next, for $i \in [k-1]$, we iteratively construct
a labeling $\ell'$ of $G_1 \cup \cdots \cup G_{i+1}$ from a labeling $\ell$ of $G_1 \cup \cdots \cup G_i$. We show by induction that the
constructed labeling has an additional property that all twins in $G_1 \cup \cdots \cup G_{i+1}$ have the same
labels and that the length of the labeling is $2^{i+1} - 1$. Observe that the labeling of $G_1$ satisfies this
property.

We choose a unique (not previously used) character $a_B$ for each biclique B of $G_{i+1}$. If B consists
of a single vertex v, then we assign to v the label $a_B \cdot \ell(v)$ if $v \in V_s$, and $\ell(v) \cdot a_B$ if $v \in V_p$. Otherwise,
since w satisfies the HUB-rule, all vertices in $B \cap V_s$ are twins in $G_1 \cup \cdots \cup G_i$ and, by the induction
hypothesis, are assigned the same labels in $\ell$. Analogously, $\ell$ will assign the same labels to all nodes
in $B \cap V_p$. Consider an arbitrary edge (u, v) in B. We apply the BM operation with character $a_B$ to
(u, v) and assign the resulting label $\ell(v) \cdot a_B \cdot \ell(u)$ to all nodes in B. This completes the construction
of the labeling $\ell'$ of $G_1 \cup \cdots \cup G_{i+1}$. Observe that it assigns the same labels to all twins in $G_1 \cup \cdots \cup G_{i+1}$,
and that the length is $2^{i+1} - 1$. To complete the proof of Theorem 3.3, we show in the Appendix
that the final labeling is an overlap labeling of G.
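The level-by-level construction can be sketched as follows. This is an illustrative rendering under several assumptions: the name `hub_labeling` is hypothetical, characters are drawn from the integers, the input decomposition is assumed (not checked) to satisfy the HUB-rule, and single-vertex bicliques receive exactly one extra character as in the description above, so the resulting labels are not padded to a common length.

```python
from itertools import count


def ov(x, y):
    """Minimum i >= 1 with suf_i(x) == pre_i(y), or 0 if x does not overlap y."""
    for i in range(1, min(len(x), len(y)) + 1):
        if x[len(x) - i:] == y[:i]:
            return i
    return 0


def hub_labeling(vs, vp, w):
    """Build labels level by level from a decomposition w assumed to satisfy the HUB-rule.

    w maps (u, v), with u in vs and v in vp, to a level in 1..k.
    """
    fresh = count()                                   # unused filler characters
    label = {x: () for x in list(vs) + list(vp)}
    for level in range(1, max(w.values()) + 1):
        edges = [e for e, i in w.items() if i == level]
        new_label = {}
        for u, v in edges:
            if u in new_label:
                continue                              # biclique of u already relabeled
            merged = label[v] + (next(fresh),) + label[u]   # the BM operation
            side_s = [a for a, b in edges if b == v]  # V_s side of the biclique
            side_p = [b for a, b in edges if a == u]  # V_p side of the biclique
            for x in side_s + side_p:
                new_label[x] = merged
        for x in label:
            if x not in new_label:                    # single-vertex biclique
                a = (next(fresh),)
                new_label[x] = a + label[x] if x in vs else label[x] + a
        label = new_label
    return label


vs, vp = ["s1", "s2"], ["p1", "p2"]
w = {("s1", "p1"): 1, ("s2", "p1"): 2, ("s2", "p2"): 1}
label = hub_labeling(vs, vp, w)
```

In this small example, a pair (u, v) has a positive overlap exactly when it is an edge, although the overlap lengths need not coincide with the levels of w.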

Note that if w is a decomposition into matchings, then our labeling algorithm behaves identically
to the Braga-Meidanis (BM) algorithm [BM02]. However, in the case that w is of size $o(\Delta(G))$,
our labeling algorithm gives a better bound than BM. For example, for the n x n biclique, our
algorithm gives a labeling of length 1, while BM gives a labeling of length $2^n - 1$.

4 Lower and upper bounds on readability 

In this section, we prove several lower and upper bounds on readability, making use of the charac¬ 
terizations of the previous section. 

4.1 Almost all graphs have readability $\Omega(n/\log n)$

In this subsection, we show that, in both the bipartite and directed graph models, there exist
graphs with readability at least $\Omega(n/\log n)$, and that in fact almost all graphs have at least this
readability.

Theorem 4.1. Almost all graphs in $\mathcal{B}_{n \times n}$ (and, respectively, $\mathcal{D}_n$) have readability $\Omega(n/\log n)$.
When restricted to a constant sized alphabet, almost all graphs in $\mathcal{B}_{n \times n}$ (and, respectively, $\mathcal{D}_n$) have
readability $\Omega(n)$.

Proof (constant sized alphabet case). We prove the theorem by a counting argument. Since there are
$n^2$ pairs of nodes in $[n]^2$ that can form edges in a graph in $\mathcal{B}_{n \times n}$, the size of $\mathcal{B}_{n \times n}$ is $2^{n^2}$. Let a be the
size of the alphabet. The number of labelings of 2n nodes with strings of length s is at most $a^{2ns}$.
In particular, labelings of length $s = n/(3 \log a)$ can generate no more than $a^{2n^2/(3 \log a)} = 2^{2n^2/3}$
bipartite graphs, which is in $o(2^{n^2})$. Consequently, almost all graphs in $\mathcal{B}_{n \times n}$ have readability
$\Omega(s) = \Omega(n/\log a) = \Omega(n)$. The proof for $\mathcal{D}_n$ is analogous and is omitted. The proof for variable
sized alphabets is given in the Appendix. □
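The arithmetic behind the counting argument can be checked numerically. The sketch below compares the two exponents in base 2; the function names are illustrative, and the logarithm in $s = n/(3 \log a)$ is assumed to be base 2.

```python
import math


def log2_labelings(n: int, s: float, a: int) -> float:
    """log2 of a^(2ns), the bound on labelings of 2n nodes with strings of length s."""
    return 2 * n * s * math.log2(a)


def log2_graphs(n: int) -> int:
    """log2 of the number of graphs in B_{n x n}, which is 2^(n^2)."""
    return n * n


n, a = 300, 4
s = n / (3 * math.log2(a))            # string length used in the proof
ratio = log2_labelings(n, s, a) / log2_graphs(n)
```

With these values the ratio of exponents is 2/3, so the labelings of length s can account for only a $2^{-n^2/3}$ fraction of all graphs.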

4.2 Distinctness and a graph family with readability $\Omega(n)$

In this subsection, we will give a technique for proving lower bounds and use it to show a family of
graphs with readability $\Omega(n)$. For any two vertices u and v, the distinctness of u and v is defined
as $DT(u,v) = \max\{|N(u) \setminus N(v)|, |N(v) \setminus N(u)|\}$. The distinctness of a bipartite graph G, denoted
by DT(G), is defined as the minimum distinctness of any pair of vertices that belong to the same
part of the bipartition. The following lemma relates the distinctness and the readability of graphs
that are not matchings (for a matching, the readability is 1, provided that it has at least one edge,
and 0 otherwise).
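Distinctness is straightforward to compute from neighborhoods. The following sketch follows the definition literally; the function name `distinctness` and the input format are assumptions, and the graph is assumed to have at least two vertices in some part so that the minimum is taken over a non-empty set.

```python
from itertools import combinations


def distinctness(vs, vp, edges):
    """DT(G): minimum over same-side pairs of max(|N(u) - N(v)|, |N(v) - N(u)|)."""
    nbrs = {x: set() for x in list(vs) + list(vp)}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)

    def dt(a, b):
        return max(len(nbrs[a] - nbrs[b]), len(nbrs[b] - nbrs[a]))

    pairs = list(combinations(vs, 2)) + list(combinations(vp, 2))
    return min(dt(a, b) for a, b in pairs)   # raises ValueError if there are no pairs
```

For the four-vertex path s1 - p1 - s2 - p2 this gives distinctness 1, and for the 2 x 2 biclique it gives 0.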

Lemma 4.1. For every bipartite graph G that is not a matching, $r(G) \ge DT(G) + 1$.

Proof. By Theorem 3.3 it suffices to show that $DT(G) \le \mathrm{hub}(G) - 1$. Let h = hub(G), let
$w : E(G) \to [h]$ be a minimum decomposition of G satisfying the HUB-rule, and consider the graphs
$G_i = G_i^w$, for $i \in [h]$. We need to show that $DT(G) \le h - 1$. Suppose first that each $G_i$ is a
matching. Then, since w is a decomposition of G, we have $\Delta(G) \le h$. Moreover, since G is not a
matching, it has a pair of distinct vertices, say u and v, with a common neighbor, which implies
$DT(G) \le DT(u,v) \le \Delta(G) - 1 \le h - 1$.

Suppose now that there exists an index $j \in [h]$ such that $G_j$ is not a matching, and let j be
the maximum such index. Then, there exist two distinct vertices in G, say u and v, that have a
common neighbor in $G_j$, and therefore belong to the same biclique of $G_j$. It follows that u and
v are non-isolated twins in $G_j$. Since w satisfies the HUB-rule, this implies that u and v are
twins in each $G_i$ with $i \in [j-1]$. Consequently, for each vertex x in G adjacent to u but not to v,
the unique $G_i$ with $(u,x) \in E(G_i)$ satisfies i > j. By the choice of j, each such $G_i$ is a matching,
and hence there can be at most h - j such vertices x. Thus $|N(u) \setminus N(v)| \le h - j$ and similarly
$|N(v) \setminus N(u)| \le h - j$, which implies the desired inequality $DT(G) \le DT(u,v) \le h - j \le h - 1$. □


While the distinctness is a much simpler graph parameter than the HUB number, simplicity 
comes with a price. Namely, the distinctness does not share the nice feature of the HUB number, 
that of being bounded on exactly the same sets of graphs as the readability. In Section 4.3 we show
the existence of graphs (specifically, trees) of distinctness 1 and of arbitrarily large readability.

We now introduce a family of graphs, inspired by the Hadamard error correcting code, and apply
Lemma 4.1 to show that their readability is at least linear in the number of nodes. We define $H_k$ as
the bipartite graph with vertex sets $V_s = \{v_s \mid v \in \{0,1\}^k \setminus \{0^k\}\}$ and $V_p = \{v_p \mid v \in \{0,1\}^k \setminus \{0^k\}\}$
and edge set

$\{(u_s, v_p) \in V_s \times V_p \mid \sum_{i=1}^{k} u[i]\, v[i] \equiv 1 \pmod 2\}$.

In other words, each vertex has a non-zero k-bit codeword vector associated with it and two vertices
are adjacent if the inner product of their codewords is odd. Let $n = 2^k$. Graph $H_k$ has 2(n - 1)
vertices, all of degree n/2, and thus (n - 1)n/2 edges. Figure 2 illustrates $H_k$.

In the Appendix, we show that every pair of vertices in the same part of the bipartition of $H_k$
has exactly n/4 common neighbors. This implies that the distinctness of $H_k$ is n/4. Combining this
with Lemma 4.1, we obtain the following theorem.


Theorem 4.2. $r(H_k) \ge n/4 + 1$.
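The stated properties of $H_k$ can be verified by brute force for small k. In the sketch below, non-zero k-bit vectors are encoded as the integers 1, ..., $2^k - 1$, and the function name `hadamard_graph` is an illustrative choice; by symmetry of the inner product, the same map describes neighborhoods on both sides of the bipartition.

```python
def hadamard_graph(k: int) -> dict:
    """Map each non-zero k-bit vector u to the set of non-zero v with odd inner product."""
    vectors = range(1, 2 ** k)
    return {u: {v for v in vectors if bin(u & v).count("1") % 2 == 1}
            for u in vectors}


k = 3
n = 2 ** k
nbrs = hadamard_graph(k)
degrees = {len(s) for s in nbrs.values()}
common = {len(nbrs[u] & nbrs[v]) for u in nbrs for v in nbrs if u < v}
differences = {len(nbrs[u] - nbrs[v]) for u in nbrs for v in nbrs if u != v}
```

For k = 3 (so n = 8), every vertex has degree 4, every same-side pair has 2 common neighbors, and every one-sided neighborhood difference has size 2, matching the values n/2, n/4 and n/4.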


This lower bound also translates to directed graphs: applying Theorem 2.1, there exist digraphs
of readability $\Omega(n)$. A major open question is: Do there exist graphs that have exponential
readability? We conjecture that they do, and that the graph family $H_k$ has exponential readability.
However, since distinctness is O(n), we note that Lemma 4.1 is insufficient for proving stronger
than $\Omega(n)$ lower bounds on the readability.


4.3 Trees 


The purely graph theoretic characterization of readability given by Theorem 3.1 allows us to derive
a sharp upper bound on the readability of trees. Recall that the eccentricity of a vertex u in a
connected graph G is defined as $\mathrm{ecc}_G(u) = \max_{v \in V(G)} \mathrm{dist}_G(u,v)$, where $\mathrm{dist}_G(u,v)$ is the number
of edges in a shortest path from u to v. The radius of a graph G is defined as the minimum
eccentricity of a vertex in G, that is, $\mathrm{radius}(G) = \min_{u \in V(G)} \max_{v \in V(G)} \mathrm{dist}_G(u,v)$.


Theorem 4.3. For every tree T, $r(T) \le \mathrm{radius}(T)$, and this bound is sharp. More precisely, for
every $k \ge 0$ there exists a tree T such that r(T) = radius(T) = k.


Proof. Let T be a tree. If $T = K_1$ (the one-vertex tree), then radius(T) = r(T) = 0 (note that
assigning the empty string to the unique vertex of T results in an overlap labeling of T). Now,
let T be of radius $r \ge 1$ and let $v \in V(T)$ be a vertex of T of minimum eccentricity (that is,
$\mathrm{ecc}_T(v) = r$). Consider the distance levels of T from v, that is, $V_i = \{u \in V(T) \mid \mathrm{dist}_T(v,u) = i\}$
for $i \in \{0, 1, \ldots, r\}$. Also, for all $i \in [r]$, let $E_i$ be the set of edges in T connecting a vertex in $V_{i-1}$
with a vertex in $V_i$. Then $\{E_1, \ldots, E_r\}$ is a partition of E(T) and the decomposition $w : E(T) \to [r]$
given by w(e) = i if and only if $e \in E_i$ is well defined. We claim that w satisfies the $P_4$-rule. Let
$P = (v_1, v_2, v_3, v_4)$ be an induced $P_4$ in T, and let $i = w(v_1,v_2)$, $j = w(v_2,v_3)$, $k = w(v_3,v_4)$.
Suppose that $j = \max\{i, j, k\}$. We may assume without loss of generality that $v_2 \in V_{j-1}$ and
$v_3 \in V_j$. Since T is a tree, $v_2$ is the only neighbor of $v_3$ in $V_{j-1}$, which implies that $v_4 \in V_{j+1}$ and
consequently k = j + 1, contrary to the assumption $j = \max\{i, j, k\}$. Thus, the $P_4$-rule is trivially
satisfied for w. By Theorem 3.1 we have $r(T) \le \max_{e \in E(T)} w(e) = r = \mathrm{radius}(T)$.

To show that for every $k \geq 0$ there exists a tree $T$ with $r(T) = \mathrm{radius}(T) = k$, we proceed by
induction. We will construct a sequence $\{(T_i,v_i)\}_{i \geq 0}$ where $T_i$ is a tree, $v_i$ is a vertex in $T_i$
with $\mathrm{ecc}_{T_i}(v_i) \leq i$, the degree of $v_i$ in $T_i$ is $i$, and $r(T_i) = \mathrm{radius}(T_i) = i$. For $i = 0$,
take $(T_0,v_0) = (K_1,v_0)$ where $v_0$ is the unique vertex of $K_1$. This clearly has the desired
properties. For $i \geq 1$, take $i$ disjoint copies of $(T_{i-1},v_{i-1})$, say $(T^j_{i-1},v^j_{i-1})$ for $j \in [i]$, add a
new vertex $v_i$, and join $v_i$ by an edge to each $v^j_{i-1}$ for $j \in [i]$. Let $T_i$ be the so constructed
tree. Clearly, the degree of $v_i$ in $T_i$ is $i$, and
$\mathrm{ecc}_{T_i}(v_i) \leq 1 + \mathrm{ecc}_{T_{i-1}}(v_{i-1}) \leq 1 + (i-1) = i$, which implies that $\mathrm{radius}(T_i) \leq i$.
On the other hand, we will show that $r(T_i) \geq i$, which together with the inequality
$r(T_i) \leq \mathrm{radius}(T_i)$ will imply the desired conclusion $\mathrm{radius}(T_i) = r(T_i) = i$. Suppose for a
contradiction that $r(T_i) < i$. Then, by Lemma 3.1, there exists a decomposition $w$ of $T_i$ of size
$i-1$ satisfying the $P_4$-rule. In particular, this implies $i \geq 2$. Since the degree of $v_i$ in $T_i$ is
$i$, there exist two edges incident with $v_i$, say $(v_i,v^j_{i-1})$ and $(v_i,v^k_{i-1})$ for some $j \neq k$, such
that $w(v_i,v^j_{i-1}) = w(v_i,v^k_{i-1})$. Let $w_1$ denote this common value. Let $x$ be a neighbor of
$v^j_{i-1}$ in $T^j_{i-1}$. (Note that $x$ exists since $v^j_{i-1}$ is of degree $i-1 \geq 1$ in $T^j_{i-1}$.) Then,
$(x,v^j_{i-1},v_i,v^k_{i-1})$ is an induced $P_4$ in $T_i$. We claim that $w(x,v^j_{i-1}) > w_1$. Indeed, if
$w(x,v^j_{i-1}) \leq w_1$ then we have
$\max\{w(x,v^j_{i-1}), w(v^j_{i-1},v_i), w(v_i,v^k_{i-1})\} = \max\{w(x,v^j_{i-1}), w_1, w_1\} = w_1$, while
$w_1 \not\geq w_1 + w(x,v^j_{i-1})$, contrary to the $P_4$-rule. Since $x$ was an arbitrary neighbor of
$v^j_{i-1}$ in $T^j_{i-1}$, we infer that every edge $e$ in $T^j_{i-1}$ incident with $v^j_{i-1}$ satisfies $w(e) > w_1$.
In particular, this leaves a set of at most $i-2$ different values that can appear on these $i-1$
edges (the value $w_1$ is excluded), and hence again there must be two edges of the same weight, say
$w_2$. Clearly, $w_2 > w_1$ and $i > 2$. Proceeding inductively, we construct a sequence of edges
$e_1, e_2, \ldots, e_i$ forming a path in $T_i$ from $v_i$ to a leaf and satisfying $w_1 < w_2 < \cdots < w_i$, where
$w_i = w(e_i)$. This implies that all the weights $w_1,\ldots,w_i$ are distinct, contrary to the fact that
the range of $w$ is contained in the set $[i-1]$. This contradiction shows that $r(T_i) \geq i$ and
completes the proof. □
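Both halves of the proof can be explored on small instances. The sketch below is an illustration under stated assumptions, not the authors' code: it builds the trees $T_i$ recursively, assigns each edge the index of its deeper distance level, and checks the $P_4$-rule in the form used in the proof (whenever the middle edge of an induced $P_4$ has maximum weight, that weight is at least the sum of the two outer weights). All function names are hypothetical.

```python
from collections import deque
from itertools import count

def build_tree(i, adj=None, ids=None):
    """Return (adj, root) for T_i: a new root joined to the roots of i copies of T_{i-1}."""
    if adj is None:
        adj, ids = {}, count()
    root = next(ids)
    adj[root] = []
    for _ in range(i):
        _, child = build_tree(i - 1, adj, ids)
        adj[root].append(child)
        adj[child].append(root)
    return adj, root

def levels(adj, root):
    """Distance of every vertex from root (breadth-first search)."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def level_weights(adj, root):
    """w(e) = i for an edge between levels i-1 and i."""
    dist = levels(adj, root)
    return {frozenset((u, v)): max(dist[u], dist[v]) for u in adj for v in adj[u]}

def satisfies_p4_rule(adj, w):
    """Check every 4-vertex path a-b-c-d; in a tree each such path is induced."""
    for b in adj:
        for c in adj[b]:
            for a in adj[b]:
                for d in adj[c]:
                    if a == c or d == b:
                        continue
                    w1 = w[frozenset((a, b))]
                    w2 = w[frozenset((b, c))]
                    w3 = w[frozenset((c, d))]
                    if w2 == max(w1, w2, w3) and w2 < w1 + w3:
                        return False
    return True

for i in range(5):
    adj, root = build_tree(i)
    w = level_weights(adj, root)
    assert len(adj[root]) == i                      # degree of v_i
    assert max(levels(adj, root).values()) == i     # eccentricity of v_i
    assert satisfies_p4_rule(adj, w)
    assert max(w.values(), default=0) == i          # size of the decomposition
```

For comparison, giving every edge of $T_2$ (a path on five vertices) the weight 1 violates the rule, in line with $r(T_2) = 2$.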


Note that for every $k \geq 2$, the tree of radius $k$ constructed in the proof of Theorem 4.3
has a pair of leaves in the same part of the bipartition and is therefore of distinctness 1. This
shows that the readability of a graph cannot be upper-bounded by any function of its distinctness
(cf. Lemma 4.1).


5 Conclusion 


In this paper, we define a graph parameter called readability, and initiate a study of its asymptotic
behavior. We give purely graph theoretic parameters (i.e., without reference to strings) that are
exactly (respectively, asymptotically) equivalent to readability for trees (respectively, $C_4$-free graphs);
however, for general graphs, the HUB number is equivalent to readability only in the sense that it
is bounded on the same set of graphs. While an $\ell$-decomposition always satisfies the HUB-rule, the
converse is not true. For example, a decomposition of $P_4$ with weights 4, 5, 3 satisfies the HUB-rule
but cannot be achieved by an overlap labeling (by Lemma 3.1, since $5 < 4 + 3$ violates the $P_4$-rule).
For this reason, the upper bound given by Lemma 3.5 leaves a gap with the lower bound of Lemma 3.4.
We are able to describe other properties that an $\ell$-decomposition must satisfy (not included in the
paper); however, we are not able to exploit them to close the gap. It is a very interesting direction
to find other necessary rules that would lead to a graph theoretic parameter that would more tightly
match readability on general graphs than the HUB number.

Consider $r(n) = \max\{r(D) \mid D \text{ is a digraph on } n \text{ vertices}\}$. We have shown $r(n) = \Omega(n)$ and
know from [BM02] that $r(n) = O(2^n)$. Can this gap be closed? Do there exist graphs with readability
$\Theta(2^n)$ (as we conjecture), or, for example, is readability always bounded by a polynomial in $n$?
Questions regarding complexity are also unexplored, e.g., given a digraph, is it NP-hard to compute
its readability? For applications to bioinformatics, the length of reads can be said to be
poly-logarithmic in the number of vertices. It would thus be interesting to further study the structure
of graphs that have poly-logarithmic readability.
A Appendix: deferred proofs

A.1 Readability of bipartite graphs and digraphs

In this subsection, we prove the following theorem. 

Theorem 2.1. There exists a bijection $\psi : \mathcal{B}_{n \times n} \to \mathcal{D}_n$ with the property that for any
$G \in \mathcal{B}_{n \times n}$ and $D \in \mathcal{D}_n$ such that $D = \psi(G)$, we have that $r(G) < r(D) \leq 2 \cdot r(G) + 1$.

Recall that $\mathcal{B}_{n \times n}$ is defined as the set of balanced bipartite graphs with nodes $[n]$ in each part.
To disambiguate the two parts, we label the vertices of $G = (V_s, V_p, E) \in \mathcal{B}_{n \times n}$ using the notation
$V_s = \{i_s \mid i \in [n]\}$ and $V_p = \{i_p \mid i \in [n]\}$.

For the proof, we define the following transformation. Let $D = ([n],A) \in \mathcal{D}_n$. Define
$\varphi(D) = (V_s,V_p,E)$ as the bipartite graph with $V_s = \{i_s \mid i \in [n]\}$, $V_p = \{i_p \mid i \in [n]\}$, and
$E = \{(i_s,j_p) \mid (i,j) \in A\}$. This transformation was proposed in [BM02]. Similarly, we define the
transformation $\psi$ as follows. Given a bipartite graph $G = (V_s,V_p,E) \in \mathcal{B}_{n \times n}$, we define
$\psi(G) = ([n],A)$ where $A = \{(i,j) \mid (i_s,j_p) \in E\}$. It is easy to see that $\psi$ is a bijection from
$\mathcal{B}_{n \times n}$ to $\mathcal{D}_n$, as required, and $\varphi$ is its inverse.
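The two transformations are simple relabelings, as the following sketch shows. The encoding is an assumption made for illustration: a digraph is a set of arcs $(i,j)$, and the bipartite vertices $i_s$, $j_p$ are represented as the tuples `(i, "s")` and `(j, "p")`.

```python
def phi(arcs):
    """Digraph on [n] -> bipartite graph: arc (i, j) becomes edge (i_s, j_p)."""
    return {((i, "s"), (j, "p")) for (i, j) in arcs}

def psi(edges):
    """Bipartite graph -> digraph on [n]: edge (i_s, j_p) becomes arc (i, j)."""
    return {(i, j) for ((i, _side_s), (j, _side_p)) in edges}

# A digraph on [3] with a loop at vertex 2; loops map to edges (2_s, 2_p).
arcs = {(1, 2), (2, 3), (3, 1), (2, 2)}
edges = phi(arcs)
assert ((2, "s"), (2, "p")) in edges
assert psi(edges) == arcs          # psi inverts phi
assert phi(psi(edges)) == edges    # and phi inverts psi
```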

The following two lemmas prove the readability bounds stated in the theorem. 

Lemma A.1. Let $D = (V,A) \in \mathcal{D}_n$ be a digraph with $A \neq \emptyset$. Then $r(\varphi(D)) < r(D)$.

Proof. Let $\ell$ be an injective overlap labeling of $D$. Since $A \neq \emptyset$, we have $\mathrm{len}(\ell) \geq 1$. Define a
labeling $\ell_\varphi$ of $\varphi(D)$ as follows. For $w \in V$, let $\ell_\varphi(w_s) = \ell(w)[2..|\ell(w)|]$ and let
$\ell_\varphi(w_p) = \ell(w)[1..|\ell(w)|-1]$. (If $|\ell(w)| = 1$, then each of $\ell_\varphi(w_s)$ and $\ell_\varphi(w_p)$ is the empty
string.) It is clear that $\ell_\varphi$ is a labeling of $\varphi(D)$ of length $\mathrm{len}(\ell) - 1$. We claim that $\ell_\varphi$ is an
overlap labeling of $\varphi(D)$. Suppose that $(u_s,v_p) \in E(\varphi(D))$. Then $(u,v) \in A$, which implies
$\mathrm{ov}_\ell(u,v) > 0$. Also, $\mathrm{ov}_\ell(u,v) < \mathrm{len}(\ell)$. Consequently, the shortest overlap between $\ell(u)$ and
$\ell(v)$ yields an overlap between $\ell_\varphi(u_s)$ and $\ell_\varphi(v_p)$, implying $\mathrm{ov}_{\ell_\varphi}(u_s,v_p) > 0$. Conversely, the
condition $\mathrm{ov}_{\ell_\varphi}(u_s,v_p) > 0$ implies $0 < \mathrm{ov}_\ell(u,v) < \mathrm{len}(\ell)$. Therefore, $(u,v) \in A$ and, by the
definition of $\varphi(D)$, also $(u_s,v_p) \in E(\varphi(D))$. This shows that $r(\varphi(D)) \leq r(D) - 1$. □
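The construction in this proof can be tried on a toy instance. The sketch below is an illustration only. It assumes that $\mathrm{ov}$ denotes the length of the shortest overlap (0 if none), that overlaps must be proper in the digraph setting, and that any nonempty overlap counts in the bipartite setting; the example strings and function names are made up.

```python
def shortest_overlap(x, y, proper):
    """Length of the shortest nonempty suffix of x that is a prefix of y, or 0.

    With proper=True, the suffix and prefix must be shorter than x and y.
    """
    limit = min(len(x), len(y)) - (1 if proper else 0)
    for k in range(1, limit + 1):
        if x[-k:] == y[:k]:
            return k
    return 0

def bipartite_labeling(label):
    """Drop the first character for w_s and the last character for w_p."""
    suffix_side = {w: s[1:] for w, s in label.items()}
    prefix_side = {w: s[:-1] for w, s in label.items()}
    return suffix_side, prefix_side

# Injective labeling of length 2 whose overlap digraph is a directed 3-cycle.
label = {1: "ab", 2: "bc", 3: "ca"}
arcs = {(u, v) for u in label for v in label
        if shortest_overlap(label[u], label[v], proper=True)}
ls, lp = bipartite_labeling(label)
edges = {(u, v) for u in ls for v in lp
         if shortest_overlap(ls[u], lp[v], proper=False)}

assert arcs == {(1, 2), (2, 3), (3, 1)}
assert edges == arcs                       # (u_s, v_p) is an edge iff (u, v) is an arc
assert all(len(s) == 1 for s in list(ls.values()) + list(lp.values()))
```

The derived labeling has length one less than the original, which is the source of the strict inequality in the lemma.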


Lemma A.2. Let $G = (V_s, V_p, E) \in \mathcal{B}_{n \times n}$. Then $r(\psi(G)) \leq 2 \cdot r(G) + 1$.


Proof. Let $\ell_G$ be an overlap labeling of $G$ and let $D = (V,A) = \psi(G)$, with $V = [n]$. For $w \in V$,
define $\ell(w) = \ell_G(w_p) \cdot w \cdot \ell_G(w_s)$. Here, $w$ is treated as a character in the alphabet $[n]$. We
assume without loss of generality that these characters are distinct from the alphabet over which
$\ell_G$ is defined. It is clear that $\ell$ is a labeling of $D$ of length $2 \cdot \mathrm{len}(\ell_G) + 1$. We claim that $\ell$
is an injective overlap labeling of $D$. For every vertex $w \in V$, its label contains a distinct middle
character corresponding to $w$, which implies injectivity. Now, suppose that $(u,v) \in A$. Then
$(u_s,v_p) \in E$, which implies $\mathrm{ov}_{\ell_G}(u_s,v_p) > 0$. By construction of $\ell$, it follows that
$0 < \mathrm{ov}_\ell(u,v) \leq \mathrm{len}(\ell_G) < \mathrm{len}(\ell)$. Conversely, suppose that $\mathrm{ov}_\ell(u,v) > 0$. By construction
of $\ell$, it follows that $\mathrm{ov}_\ell(u,v) \leq \mathrm{len}(\ell_G)$. Therefore, $\mathrm{ov}_{\ell_G}(u_s,v_p) = \mathrm{ov}_\ell(u,v) > 0$, which
implies $(u_s,v_p) \in E$ and consequently $(u,v) \in A$. This shows that $r(\psi(G)) \leq 2 \cdot r(G) + 1$. □
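A small sketch of this construction follows, as an illustration under assumptions rather than a definitive implementation. Labels are represented as tuples so that the middle character can be a symbol outside the original alphabet; $\mathrm{ov}$ is assumed to be the shortest-overlap length, proper in the digraph setting and unrestricted in the bipartite setting. The example labeling and the names are hypothetical.

```python
def shortest_overlap(x, y, proper):
    """Length of the shortest nonempty suffix of x that is a prefix of y, or 0."""
    limit = min(len(x), len(y)) - (1 if proper else 0)
    for k in range(1, limit + 1):
        if x[-k:] == y[:k]:
            return k
    return 0

def digraph_labeling(label_s, label_p):
    """l(w) = l_G(w_p) . w . l_G(w_s), with w encoded as a fresh middle symbol."""
    return {
        w: tuple(label_p[w]) + (("vertex", w),) + tuple(label_s[w])
        for w in label_s
    }

# A bipartite overlap labeling of length 2 on parts {1_s, 2_s} and {1_p, 2_p}.
label_s = {1: "ab", 2: "ba"}
label_p = {1: "ba", 2: "bb"}
edges = {(u, v) for u in label_s for v in label_p
         if shortest_overlap(label_s[u], label_p[v], proper=False)}

label = digraph_labeling(label_s, label_p)
arcs = {(u, v) for u in label for v in label
        if shortest_overlap(label[u], label[v], proper=True)}

assert edges == {(1, 1), (1, 2), (2, 1)}
assert arcs == edges                                  # same adjacency as psi(G)
assert all(len(s) == 2 * 2 + 1 for s in label.values())
assert len(set(label.values())) == len(label)         # injective
```

The middle symbol blocks any overlap longer than $\mathrm{len}(\ell_G)$, which is why the adjacency is preserved.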


Given $G \in \mathcal{B}_{n \times n}$, we can apply the two lemmas to derive the inequality of Theorem 2.1:

$$r(G) = r(\varphi(\psi(G))) < r(\psi(G)) \leq 2 \cdot r(G) + 1.$$
A.2 Trees and $C_4$-free graphs

Lemma 3.1. Let $\ell$ be an overlap labeling of a bipartite graph $G$. Then the $\ell$-decomposition satisfies
the $P_4$-rule.

Proof. Let $G = (V_s, V_p, E)$. Denote by $w$ the $\ell$-decomposition. Suppose for the sake of
contradiction that $w$ violates the $P_4$-rule. Then, there exists an induced four-vertex path
$P = (u_1,u_2,u_3,u_4)$ in $G$ with $u_1 \in V_p$ (and consequently $u_2, u_4 \in V_s$ and $u_3 \in V_p$) such that
$\max\{w(u_1,u_2), w(u_2,u_3), w(u_3,u_4)\} = w(u_2,u_3) < w(u_1,u_2) + w(u_3,u_4)$. Then, $b = \max\{a,b,c\}$
and $b < a + c$, where $a = \mathrm{ov}_\ell(u_2,u_1)$, $b = \mathrm{ov}_\ell(u_2,u_3)$, and $c = \mathrm{ov}_\ell(u_4,u_3)$. We will show that
there exists an overlap between $\ell(u_4)$ and $\ell(u_1)$ of length $a + c - b$, which will prove the lemma,
by contradicting the fact that $\ell$ is an overlap labeling and $(u_4,u_1) \notin E$ (as $P$ is an induced $P_4$).

Let $r$ be the length of $\ell$. Writing the overlaps in terms of substrings, we obtain that
$\mathrm{suf}_a(\ell(u_2)) = \mathrm{pre}_a(\ell(u_1))$, $\mathrm{suf}_b(\ell(u_2)) = \mathrm{pre}_b(\ell(u_3))$, and $\mathrm{suf}_c(\ell(u_4)) = \mathrm{pre}_c(\ell(u_3))$.
Let $d = a + c - b$. Note that $1 \leq d \leq \min\{a,c\}$. Applying the equalities, we get
$\mathrm{pre}_d(\ell(u_1)) = \ell(u_2)[r-a+1..r-a+d] = \ell(u_3)[c-d+1..c] = \mathrm{suf}_d(\ell(u_4))$, establishing the
existence of the desired overlap. □
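The index arithmetic in the last chain of equalities can be checked mechanically. The sketch below uses made-up strings of length $r = 4$ with $a = 2$, $b = 3$, $c = 2$; it relies only on the three suffix-prefix equalities, and the helper names are hypothetical. The 1-based ranges of the proof are translated to 0-based Python slices.

```python
def pre(s, k):
    """Prefix of s of length k."""
    return s[:k]

def suf(s, k):
    """Suffix of s of length k."""
    return s[len(s) - k:]

def forced_overlap(l1, l2, l3, l4, a, b, c):
    """Return the string of length d = a + c - b shared by pre(l1) and suf(l4)."""
    r = len(l2)
    assert len(l1) == len(l3) == len(l4) == r
    # Hypotheses of the proof.
    assert suf(l2, a) == pre(l1, a)
    assert suf(l2, b) == pre(l3, b)
    assert suf(l4, c) == pre(l3, c)
    assert b == max(a, b, c) and b < a + c
    d = a + c - b
    assert 1 <= d <= min(a, c)
    via_l2 = l2[r - a : r - a + d]   # l2[r-a+1 .. r-a+d] in 1-based notation
    via_l3 = l3[c - d : c]           # l3[c-d+1 .. c] in 1-based notation
    assert pre(l1, d) == via_l2 == via_l3 == suf(l4, d)
    return via_l2

assert forced_overlap("BCst", "pABC", "ABCq", "mnAB", a=2, b=3, c=2) == "B"
```

In this example the suffix "B" of the label of $u_4$ equals the prefix "B" of the label of $u_1$, which is exactly the overlap that an induced $P_4$ forbids.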

Lemma 3.2. Let $c$ be a character that is contained in $\ell_j(u)$ and in $\ell_j(v)$, for some pair of distinct
vertices $u$ and $v$. Then there exists a path between $u$ and $v$ in $G^j$.

Proof. We prove the statement by induction on $m \in \{0,1,\ldots,|E|\}$. For the base case, $\ell_0$ does not
label any positions. Now, assume that $\ell_m$ satisfies the lemma and consider the new positions labeled
by $\ell_{m+1}$, with $e_{m+1} = (u,v)$. Recall that $A$ is a possibly empty string of new characters inserted
into the middle of the new labels. A position of $u$ labeled with a character from $A$ is adjacent to
the position of $v$ labeled with the same character, and since the characters are new, these are the
only two positions labeled with this character. Now, each new position of $u$ that is not labeled
with a character from $A$ is labeled with a character from $\ell_m(v)$. By the induction hypothesis, $v$ is
connected by a path to all vertices with occurrences of the same character in $G^m$, which implies
the same statement for $u$ in $G^{m+1}$ (using the fact that $E(G^{m+1}) = E(G^m) \cup \{e_{m+1}\}$). The case of
the new characters in the label of $v$ is symmetric. □

A.3 General graphs 

Lemma 3.4. Let $\ell$ be an overlap labeling of a bipartite graph $G$. Then the $\ell$-decomposition satisfies
the HUB-rule.

Proof. Denote the vertices and edges of the graph as usual: $G = (V_s,V_p,E)$. Consider the
$\ell$-decomposition. Fix $i \in [k]$. First, we show that $G_i$ is a union of disjoint bicliques. Observe that
a bipartite graph is a disjoint union of bicliques if and only if it contains no induced $P_4$, where
a $P_4$ denotes the path on 4 vertices and 3 edges. Therefore, it suffices to prove that $G_i$ does
not contain any induced $P_4$. Consider a 4-vertex path $(u,x,y,z)$ in $G_i$. We will show that $G_i$
contains the edge $(u,z)$. Since each edge of the path is in $G_i$, the corresponding overlaps imply that
$\mathrm{inner}_i(\ell(u)) = \mathrm{inner}_i(\ell(x)) = \mathrm{inner}_i(\ell(y)) = \mathrm{inner}_i(\ell(z))$. Thus, $\mathrm{inner}_i(\ell(u)) = \mathrm{inner}_i(\ell(z))$.
To complete the proof that $(u,z) \in E(G_i)$, it remains to show that $(u,z) \notin E(G_j)$ for all $j \in [i-1]$.
For the sake of contradiction suppose $\mathrm{inner}_j(\ell(u)) = \mathrm{inner}_j(\ell(z))$ for some $j \in [i-1]$. Then
$\mathrm{inner}_j(\ell(u)) = \mathrm{inner}_j(\ell(x))$ and, consequently, $(u,x)$ is in $E(G_j)$, which contradicts that it is in
$E(G_i)$. Therefore, $(u,z) \in E(G_i)$. This completes the proof that $G_i$ is a disjoint union of bicliques.
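The observation that a bipartite graph is a disjoint union of bicliques exactly when it has no induced $P_4$ can be confirmed exhaustively on small graphs. The sketch below is an illustration with hypothetical names. It tests the biclique property through the equivalent condition that any two left-side neighbourhoods are equal or disjoint, and it looks for an induced $P_4$ by brute force.

```python
from itertools import permutations, product

def has_induced_p4(vertices, adjacent):
    """Brute-force search for a path a-b-c-d with no chords."""
    for a, b, c, d in permutations(vertices, 4):
        if (adjacent(a, b) and adjacent(b, c) and adjacent(c, d)
                and not adjacent(a, c) and not adjacent(a, d)
                and not adjacent(b, d)):
            return True
    return False

def is_disjoint_union_of_bicliques(left, right, edges):
    """True iff any two left-side neighbourhoods are equal or disjoint."""
    nbrs = {u: frozenset(v for v in right if (u, v) in edges) for u in left}
    return all(nbrs[u] == nbrs[v] or not (nbrs[u] & nbrs[v])
               for u in left for v in left)

left, right = ["s0", "s1", "s2"], ["p0", "p1", "p2"]
pairs = list(product(left, right))

# Enumerate all 2^9 bipartite graphs on 3 + 3 vertices.
for mask in range(2 ** len(pairs)):
    edges = {p for bit, p in enumerate(pairs) if (mask >> bit) & 1}

    def adjacent(x, y, edges=edges):
        return (x, y) in edges or (y, x) in edges

    assert has_induced_p4(left + right, adjacent) != \
        is_disjoint_union_of_bicliques(left, right, edges)
```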
Next we show that the $\ell$-decomposition is hierarchical, i.e., satisfies the second condition of the
HUB-rule definition. Fix $i \in \{2,\ldots,k\}$ and consider two non-isolated twins $u$, $v$ in $G_i$. By definition
of non-isolated twins, there is a vertex $z$ that is adjacent to both $u$ and $v$ in $G_i$. By definition of $G_i$,
we get $\mathrm{inner}_i(\ell(u)) = \mathrm{inner}_i(\ell(z)) = \mathrm{inner}_i(\ell(v))$. Therefore, for all $j \in [i-1]$, the corresponding
inner affixes of labels of $u$ and $v$ are the same: $\mathrm{inner}_j(\ell(u)) = \mathrm{inner}_j(\ell(v))$. Consequently, in $G_j$,
every neighbor of $u$ must be a neighbor of $v$, and vice versa. That is, $u$ and $v$ are twins in $G_j$ for
all $j \in [i-1]$, completing the proof of the lemma. □


Lemma 3.5. Let $w$ be a decomposition of size $k$ satisfying the HUB-rule of a bipartite graph $G$.
Then there is an overlap labeling of $G$ of length $2^k - 1$.


Proof. In Section 3.2, we described how to inductively construct a labeling $\ell$ of the appropriate
length. It remains to prove that the final labeling is an overlap labeling of $G$. It is easy to see that
the initial labeling of $G_1$ is an overlap labeling. Now we show that if $\ell$ is an overlap labeling of
$G_1 \cup \cdots \cup G_i$, our construction yields an overlap labeling of $G_1 \cup \cdots \cup G_{i+1}$.

Suppose first that $(u,v)$ is an edge of $G_1 \cup \cdots \cup G_{i+1}$. If $(u,v)$ is an edge of $G_{i+1}$ then, by
construction, the labels of $u$ and $v$ after $i+1$ steps are identical, and consequently they overlap. If
$(u,v)$ is not an edge of $G_{i+1}$, then it is an edge of $G_1 \cup \cdots \cup G_i$, and the bicliques $B$ and $B'$ of
$G_{i+1}$ containing $u$ and $v$, respectively, are distinct. This implies that the labels of $u$ and $v$ after
$i+1$ steps are of the form $x \cdot a_B \cdot \ell(u)$ and $\ell(v) \cdot a_{B'} \cdot y$, respectively, for some (possibly empty)
strings $x$, $y$, $a_B$, and $a_{B'}$, where $\ell(u)$ and $\ell(v)$ are the respective labels of $u$ and $v$ after $i$ steps.
Since, by the induction hypothesis, $\ell(u)$ and $\ell(v)$ overlap, so do the extended labels.

Finally, if $(u,v) \in V_s \times V_p$ is a pair of nonadjacent vertices of $G_1 \cup \cdots \cup G_{i+1}$, then $u$ and $v$ are
nonadjacent in $G_1 \cup \cdots \cup G_i$. By the induction hypothesis, their labels after $i$ steps, $\ell(u)$ and $\ell(v)$,
do not overlap. Since $u$ and $v$ are also not adjacent in $G_{i+1}$, the bicliques of $G_{i+1}$ containing $u$ and
$v$, say $B$ and $B'$, are distinct, and thus the labels of $u$ and $v$ after $i+1$ steps are of the form
$x \cdot a_B \cdot \ell(u)$ and $\ell(v) \cdot a_{B'} \cdot y$, respectively. Moreover, if both $x \cdot a_B$ and $a_{B'} \cdot y$ are nonempty
then $a_B \neq a_{B'}$. Hence, by construction, the two labels do not overlap. This completes the proof. □


A.4 Almost all graphs have readability $\Omega(n/\log n)$

In this subsection, we give a proof of the following theorem. 

Theorem 4.1. Almost all graphs in $\mathcal{B}_{n \times n}$ (and, respectively, $\mathcal{D}_n$) have readability $\Omega(n/\log n)$.
When restricted to a constant sized alphabet, almost all graphs in $\mathcal{B}_{n \times n}$ (and, respectively, $\mathcal{D}_n$) have
readability $\Omega(n)$.

We will need the following reduction, implicitly shown in [BM02].

Property A.1 ([BM02]). Let $G$ be a digraph or a bipartite graph, let $\Sigma$ and $\Sigma'$ be alphabets with
$|\Sigma| > |\Sigma'| \geq 2$, and let $\ell$ be an overlap labeling of $G$ over $\Sigma$. Then there exists an overlap labeling
$\ell'$ of $G$ over $\Sigma'$ such that $\mathrm{len}(\ell') \leq (2 \log_{|\Sigma'|} |\Sigma| + 1) \cdot \mathrm{len}(\ell)$.

The proof of Theorem 4.1 for constant sized alphabets is in the main text. For variable sized
alphabets, we give the proof here.

Proof (variable sized alphabets). The proof for the constant sized alphabet shows that only $o(2^{n^2})$
graphs in $\mathcal{B}_{n \times n}$ have readability at most $n/3$ over the binary alphabet. It therefore suffices to show
that every graph in $\mathcal{B}_{n \times n}$ of readability at most $n/(15 \log_2 n)$ (over an unrestricted alphabet) has
readability at most $n/3$ over the binary alphabet. This is indeed the case. Suppose that $G \in \mathcal{B}_{n \times n}$
is of readability $r \leq n/(15 \log_2 n)$, and fix an overlap labeling $\ell$ of $G$ of length $r$. Since $\ell$ uses
$2nr$ characters in total, the alphabet size of labeling $\ell$ can be assumed to be at most $2nr$. By
Property A.1, $G$ has an overlap labeling $\ell'$ over the binary alphabet such that
$\mathrm{len}(\ell') \leq (2 \log_2(2nr) + 1) r$. Since $2nr \leq n^2$, we have $2 \log_2(2nr) + 1 \leq 5 \log_2 n$ and consequently
the readability of $G$ over the binary alphabet is at most $\mathrm{len}(\ell') \leq 5 r \log_2 n \leq n/3$. The proof for
$\mathcal{D}_n$ is analogous and is omitted. □
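The chain of inequalities at the end of this proof can be sanity-checked numerically. The sketch below is an illustration only: it takes $r$ to be the largest integer not exceeding $n/(15 \log_2 n)$, skips values of $n$ for which this is zero, and allows a small floating-point tolerance. The function name is hypothetical.

```python
import math

def binary_length_bound(n, r):
    """(2*log2(2nr) + 1) * r: the bound from Property A.1 with |Sigma| <= 2nr, |Sigma'| = 2."""
    return (2 * math.log2(2 * n * r) + 1) * r

TOL = 1e-9  # tolerance for floating-point rounding

for n in range(2, 5001):
    r = math.floor(n / (15 * math.log2(n)))  # largest admissible integer readability
    if r < 1:
        continue
    assert 2 * n * r <= n ** 2
    assert binary_length_bound(n, r) <= 5 * r * math.log2(n) + TOL
    assert 5 * r * math.log2(n) <= n / 3 + TOL
```

The first admissible value in this range is $n = 100$, where $r = 1$.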

A.5 Graph family with readability $\Omega(n)$

We prove the following lemma, which was used in Section 4.2 to prove Theorem 4.2.

Lemma A.3. In graph $H_k$, if $i$ vertices have a common neighbor, then they have at least
$2^{k-i} = n/2^i$ common neighbors. Moreover, if two vertices have a common neighbor, then they have
exactly $n/4$ common neighbors.

Proof. Suppose that vertices $w_1,\ldots,w_i \in \{0,1\}^k \setminus \{0^k\}$ in the same part of the bipartition of
$H_k$ have a common neighbor. Then the set $X$ of all vectors $x \in \{0,1\}^k$ such that
$w_j^\top x = \sum_{p=1}^{k} w_j[p]\,x[p] \equiv 1 \pmod 2$ for all $j \in [i]$ is non-empty. Notice that $X \subseteq \{0,1\}^k$ is the
set of solutions of the equation $Wx = \mathbf{1}$ over the field $GF(2)$, where $W$ is the $i \times k$ matrix with
the rows formed by the $w_j$'s, and $\mathbf{1}$ is the all-one vector of length $i$. The set $X$ forms an affine
subspace of the vector space $\{0,1\}^k$ over $GF(2)$ of dimension $k - r$, where $r = \mathrm{rank}(W)$. Therefore,
vertices $w_1,\ldots,w_i$ have exactly $|X| = 2^{k-r}$ common neighbors. Since $r \leq i$, we obtain $|X| \geq 2^{k-i}$.

If $i = 2$, then the rank of $W$ is exactly 2, which implies the second part of the lemma. □
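The counting argument can be compared against direct enumeration for small $k$. The sketch below is an illustration with hypothetical names. It encodes vectors of $\{0,1\}^k$ as integers, computes the rank over $GF(2)$ with a standard XOR-basis elimination (a generic technique, not taken from the paper), and checks that whenever a common neighbor exists their number is $2^{k - \mathrm{rank}(W)}$.

```python
from itertools import combinations

def gf2_rank(rows):
    """Rank over GF(2) of vectors encoded as integers (XOR-basis elimination)."""
    basis = []
    for row in rows:
        for b in basis:
            row = min(row, row ^ b)  # clears the leading bit of b if it is set in row
        if row:
            basis.append(row)
    return len(basis)

def common_neighbours(k, ws):
    """Nonzero x in {0,1}^k with odd inner product with every w in ws."""
    return [x for x in range(1, 2 ** k)
            if all(bin(w & x).count("1") % 2 == 1 for w in ws)]

k = 4
for i in (1, 2, 3):
    for ws in combinations(range(1, 2 ** k), i):
        found = len(common_neighbours(k, ws))
        if found:  # hypothesis of the lemma: some common neighbour exists
            assert found == 2 ** (k - gf2_rank(ws))
            assert found >= 2 ** (k - i)

# Two distinct nonzero vectors always have rank 2, hence exactly n/4 common neighbours.
assert all(len(common_neighbours(k, ws)) == 2 ** k // 4
           for ws in combinations(range(1, 2 ** k), 2))
```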